Repurposing Drugs for Inhibition against ALDH2 via a 2D/3D Ligand-Based Similarity Search and Molecular Simulation

Aldehyde dehydrogenase-2 (ALDH2) is a crucial enzyme participating in intracellular aldehyde metabolism and is acknowledged as a potential therapeutic target for the treatment of alcohol use disorder and other addictive behaviors. Using previously reported ALDH2 inhibitors of Daidzin, CVT-10216, and CHEMBL114083 as reference molecules, here we perform a ligand-based virtual screening of world-approved drugs via 2D/3D similarity search methods, followed by the assessments of molecular docking, toxicity prediction, molecular simulation, and the molecular mechanics Poisson–Boltzmann surface area (MM–PBSA) analysis. The 2D molecular fingerprinting of ECFP4 and FCFP4 and 3D molecule-shape-based USRCAT methods show good performances in selecting compounds with a strong binding behavior with ALDH2. Three compounds of Zeaxanthin (q = 0), Troglitazone (q = 0), and Sequinavir (q = +1 e) are singled out as potential inhibitors; Zeaxanthin can only be hit via USRCAT. These drugs displayed a stronger binding strength compared to the reported potent inhibitor CVT-10216. Sarizotan (q = +1 e) and Netarsudil (q = 0/+1 e) displayed a strong binding strength with ALDH2 as well, whereas they displayed a shallow penetration into the substrate-binding tunnel of ALDH2 and could not fully occupy it. This likely left a space for substrate binding, and thus they were not ideal inhibitors. The MM–PBSA results indicate that the selected negatively charged compounds from the similarity search and Vina scoring are thermodynamically unfavorable, mainly due to electrostatic repulsion with the receptor (q = −6 e for ALDH2). The electrostatic attraction with positively charged compounds, however, yielded very strong binding results with ALDH2. These findings reveal a deficiency in the modeling of electrostatic interactions (in particular, between charged moieties) in the virtual screening via the 2D/3D similarity search and molecular docking with the Vina scoring system.


Introduction
Substance use disorder is a disease characterized by the uncontrolled intake of certain addictive substances that affects the human brain, psychology, and behavior [1].These substances are able to elicit pleasurable sensations by strongly stimulating the brain's reward system.Alcohol use disorder (AUD), for instance, is the most widespread addictive behavior.The toxic aldehydes (acetaldehyde) produced by alcohol (ethanol) metabolism can damage the morphology and function of proteins [2], DNA [3], organs [4], and tissues [5], leading to serious health issues.For example, it may result in alcoholic hepatitis, liver cirrhosis, liver cancer, or other liver diseases [6,7]; moreover, it increases the potential risk of neurodegenerative disorders, such as neuroinflammation, movement disorders, and cognitive dysfunction [8,9].The types of withdrawal syndrome associated with alcohol include anxiety, limbs tremor, emesis, autonomic nervous system hyperfunction, and possibly more.This explains why it may be simple to develop an addiction but is more challenging to quit [10].
In the human body, acetaldehyde dehydrogenases (ALDHs) play a critical role in the metabolism of aldehydes, converting them into non-toxic carboxylic acids and related derivatives.Of the 19 members of the human ALDH superfamily, ALDH1A1, ALDH1B1, and ALDH2 are the three enzymes most closely involved in acetaldehyde metabolism [11].Due to the extremely low k m (k m < 1 µM) levels and the wide distribution in various tissues, ALDH2 is the most effective one to oxidize and eliminate endogenous and exogenous aldehydes irreversibly [12].Approximately 40% of Asians have an ALDH2 gene polymorphism, and the mutant ALDH2 allele (ALDH2*2) encodes an enzyme with a low activity.Even with the presence of only one ALDH2*2 subunit, heterotetramer ALDH2 exhibits a loss in function [13,14].Individuals with this genetic polymorphism likely suffer from negative physiological reactions (disulfiram-like reactions), including flushing, dizziness, and palpitations, mainly due to the acetaldehyde accumulation in their bodies after alcohol intake.These protective symptoms alert the drinker to avoid alcohol consumption and hence reduce the incidence of alcohol addiction [15].Stress and anxiety are the major symptoms of addictive behavior.Studies on animals have shown that inhibiting ALDH2 enzyme activity can ease anxiety [16], decrease binge eating [17], and reduce the urge to seek out alcohol [18], cocaine [19], and methamphetamine [20].This might be ascribed to the fact that ALDH2 inhibition suppresses the synthesis of drug-associated dopamine [19].ALDH2 is therefore a crucial target for treating AUD and preventing a relapse.
Anti-craving drugs that are being developed for targeting ALDH2, opioid receptors, the GABA (γ-aminobutyric acid) receptor, and the NMDA (N-methyl-d-aspartate) receptor are becoming a highly sought-after field for basic and applied research.Baclofen, gabapentin, topiramate, and ondansetron are still under clinical research due to their side effects and non-specific alcohol abuse [21,22].For the treatment of AUD, there are four clinical drugs: disulfiram, acamprosate, naltrexone, and nalmefene; the former three were approved by the U.S. Food and Drug Administration (FDA) and the latter was approved the European Medicines Agency (EMA) [22,23].Only disulfiram targets ALDH2.However, disulfiram inhibits ALDH1 more effectively than ALDH2 and it has a global inhibitory effect on ALDH2 [24,25].In a variety of organs, ALDH2 protects tissues from damage by reducing oxidative stress and removing toxic aldehydes [26][27][28][29].Daidzin is a naturally occurring product and is a specific inhibitor of ALDH2 (IC 50 = 0.08 µM) [30].Keung et al. synthesized a number of isoflavone analogs, of which CHEMBL114083 (compound 20 in their work) showed the strongest inhibition (IC 50 = 0.04 µM) [30].They also crystallized the complex of ALDH2 with Daidzin at 2.4 Å (PDB code: 2VLE), offering a structural basis for the specificity and affinity of Daidzin against ALDH2 [31].Inspired by this crystal structure, a highly effective ALDH2 inhibitor, CVT-10216 (IC 50 = 0.029 µM), was proposed with a therapeutic potential to suppress heavy drinking and alcohol seeking [18]; it had the ability to inhibit ALDH2 reversibly while resulting in a minimal interference with other enzymes and receptors within the body [18].
Computer virtual screening has become an important solution for accelerating the search of lead compounds for new drug development and drug repurposing from a largescale compound database.Based on the concept that similar molecular structures and/or shapes likely have similar physiological activities and biological functions, similarity search methods with 2D/3D molecular fingerprints and/or 3D molecular shapes have proved to be effective in structure-based virtual screening [32,33].Other techniques, such as molecular docking and molecular simulation, can be combined with the 2D/3D similarity search for a high-throughput virtual screening in a parallel or hierarchical manner [34,35].Using fusidic acid as a reference compound, Pavadai et al. performed the 2D/3D similarity-based virtual screening of steroid-type natural products and hit a few growth inhibitors of Plasmodium falciparum with a low IC 50 and comparable ADME (absorption, distribution, metabolism, and excretion) properties to the reference compound [36].The 2D and 3D similarity-based approach was also used to find potential inhibitors for diverse targets, such as human hexokinase II (HKII) [37], p53-MDM2 [38], and HIV-1 [39].
For the receptor ALDH2, however, there are few reports on the discovery of its potent inhibitors.Wang and coworkers identified five small-molecule inhibitors from a screening of a commercial chemical database (50,000 compounds), and their IC 50 values ranged from 0.5 to 23 µM [40], higher than those for the isoflavone analogs mentioned above.Here, we chose three isoflavone analogs of Daidzin, CVT-10216, and CHEMBL114083 as reference compounds and aim to hit potent inhibitors against ALDH2 from a ligandbased virtual screening of world-approved drugs.A variety of computational approaches, such as the 2D and 3D similarity search, molecular docking, toxicity prediction, molecular dynamics (MD) simulations, and the molecular mechanics Poisson−Boltzmann surface area (MM-PBSA) analysis are applied to evaluate the drug compounds and explore receptorligand interactions at a molecular level.This work is valuable for the further design of potent inhibitors against ALDH2 for therapeutic treatments of alcohol addiction.

Virtual Screening of World-Approved Drugs via the 2D/3D Similarity Search
Given the excellent performance of isoflavone analogs, three compounds of Daidzin, CVT-10216, and CHEMBL114083 were chosen as references in the ligand-based 2D/3D similarity search.The molecular structures of the reference compounds are presented in Figure 1, as well as the half-maximal inhibitory concentration (IC 50 ) against ALDH2.
For the receptor ALDH2, however, there are few reports on the discovery of its potent inhibitors.Wang and coworkers identified five small-molecule inhibitors from a screening of a commercial chemical database (50,000 compounds), and their IC50 values ranged from 0.5 to 23 µM [40], higher than those for the isoflavone analogs mentioned above.Here, we chose three isoflavone analogs of Daidzin, CVT-10216, and CHEMBL114083 as reference compounds and aim to hit potent inhibitors against ALDH2 from a ligand-based virtual screening of world-approved drugs.A variety of computational approaches, such as the 2D and 3D similarity search, molecular docking, toxicity prediction, molecular dynamics (MD) simulations, and the molecular mechanics Poisson−Bol mann surface area (MM-PBSA) analysis are applied to evaluate the drug compounds and explore receptor-ligand interactions at a molecular level.This work is valuable for the further design of potent inhibitors against ALDH2 for therapeutic treatments of alcohol addiction.

Virtual Screening of World-Approved Drugs via the 2D/3D Similarity Search
Given the excellent performance of isoflavone analogs, three compounds of Daidzin, CVT-10216, and CHEMBL114083 were chosen as references in the ligand-based 2D/3D similarity search.The molecular structures of the reference compounds are presented in Figure 1, as well as the half-maximal inhibitory concentration (IC50) against ALDH2.The 2D molecular fingerprints decode the 2D structural fragments of chemical molecules using binary bit vectors to calculate and compare the molecular similarity.Due to its high efficiency and effectiveness, it continues to be the top option for preliminary screening when handling large datasets [41].The Tanimoto coefficient (Tc) is often used as a key indicator for measuring similarity.For two fingerprints, A and B, Tc is defined as the ratio of their intersection to their union (Equation ( 1)) [42,43]: The 2D molecular fingerprints decode the 2D structural fragments of chemical molecules using binary bit vectors to calculate and compare the molecular similarity.Due to its high efficiency and effectiveness, it continues to be the top option for preliminary screening when handling large datasets [41].The Tanimoto coefficient (T c ) is often used as a key indicator for measuring similarity.For two fingerprints, A and B, T c is defined as the ratio of their intersection to their union (Equation ( 1)) [42,43]: where I(A, B) = |A ∩ B| is the cardinality of the intersection of fingerprints A and B and represents the number of features present in both fingerprints; U(A, B) = |A ∪ B| is the union of the two fingerprints and indicates the total number of features present in either A or B. A higher T c value means a greater similarity.We tested four algorithms for generating 2D molecular fingerprints supported in the RDKit toolkit (version 2022.09.5) [44], including MACCS (Molecular ACCess System) Keys [45], RDKit-specific fingerprints, Extended-Connectivity FingerPrints (ECFP4) with a diameter of 4 [46], and Functional-Class FingerPrints FCFP4 (a variant of ECFP) with a diameter of 4. The last two were derived via applying the Morgan algorithm [47] and therefore belonged to a family of Morgan fingerprints (also known as circular fingerprints).
Through the 2D molecular fingerprint similarity search of world-approved drugs (5903 compounds), the top 100 compounds with high T c values for each combination of the algorithm and the reference compound were collected.For instance, the top 100 T c values for MACCS keys, RDKit, ECFP4, and FCFP4 fingerprints ranged from 0.649 to 0.974, 0.512 to 0.743, 0.301 to 0.581, and 0.348 to 0.613, respectively, for the search using Daidzin as a reference (Table S1 in the Supplementary Materials).Note that Daidzin is a world-approved drug (ZINC ID: ZINC004098610); its T c value amounted to 1 and was ignored for a clear presentation of the similarity of other compounds.T c ranges from 0 to 1, regardless of the fingerprint length, and it greatly depends on the used type of fingerprints [48].Therefore, a direct comparison of the T c values from different algorithms/reference combinations appeared to make no sense.
The 2D molecular fingerprinting describes the topological structures, while 3D similaritybased screening takes into account the physical and chemical factors, such as the atomic distance, electrostatic potential, 3D molecular shape, and pharmacological information.The 3D shape of ligands is directly related to the complementarity of the molecular associations between ligands and receptors, and chemical group proximity is a crucial prerequisite for providing a variety of intermolecular interactions.We conducted the 3D similarity-based screening using the algorithms of E3FP [49] and USRCAT [50].E3FP stands for the Extended 3D FingerPrint, inspired by the commonly used ECFP (2D) fingerprint, and it integrates tightly with the RDKit toolkit [44].USRCAT is an extension of the ultrafast shape recognition (USR) algorithm [51,52] with CREDO Atom Types [53].The top 100 T c values for E3FP using Daidzin as a reference ranged from 0.173 to 0.346 (Table S2 in the Supplementary Materials).USRCAT is a method based on the 3D molecular shape, and its similarity is assessed by a similarity score; the score for the Daidzin reference range from 0.190 to 0.297 (Table S2).The 2D and 3D similarity search results for the reference compounds of CVT-10216 and CHEMBL114083 are given in Tables S3-S6, respectively, in the Supplementary Material.Due to the flexibility and complexity of the 3D structures, the T c values and molecular similarity scores with 3D similarity-based approaches are marginally lower than that with 2D molecular fingerprints [54].
In total, we tested six similarity-search algorithms (four for 2D and two for 3D) using three reference compounds.The collected top 100 compounds for 18 algorithms/reference combinations (Tables S1-S6) were merged together, resulting in a subset of 861 compounds from the world-approved drugs.Considering multiple charge states or isoforms at different (pH) conditions, we obtained 1097 compounds for a further assessment via molecular docking and toxicity prediction.

Comparison of Similarity-Search Methods
In order to compare different algorithms for the selection of hit compounds, we calculated the binding affinities (∆E dock ) of the top 100 compounds with the receptor (ALDH2 plus NAD + ) via molecular docking.In the absence of NAD + , the ligand enters into the cofactor-binding domain.For the 2D molecular fingerprint similarity search, ECFP4 appeared to have more of a chance to select a compound with a strong binding strength with ALDH2 than the other methods of MACCS Keys, RDKit, and FCFP4 in most cases, as indicated by a higher probability at more negative ∆E dock values (Figure 2).For instance, a higher probability for ∆E dock ≤ −8 kcal/mol was observed for ECFP4 compared to the other three methods (Figure 2a).When using CVT-10216 as a reference, a similar performance was observed for the probability profiles of ECFP4 and FCFP4 with ∆E dock ≤ −9 kcal/mol (Figure 2b).The MACCS Keys fingerprint showed a bad perfor- mance with Daidzin and CVT-10216 as reference molecules (Figure 2a,b), while it appeared to be better than the fingerprinting methods of RDKit and FCFP4 (Figure 2c).The good performance of ECFP4 may be ascribed to the fact that it describes detailed information on the structure of atoms and their neighborhoods, such as the number of direct connections of non-hydrogen atoms, the atomic number, atomic charge, and number of connected hydrogen atoms [46].

Comparison of Similarity-Search Methods
In order to compare different algorithms for the selection of hit compounds, we calculated the binding affinities (∆Edock) of the top 100 compounds with the receptor (ALDH2 plus NAD + ) via molecular docking.In the absence of NAD + , the ligand enters into the cofactor-binding domain.For the 2D molecular fingerprint similarity search, ECFP4 appeared to have more of a chance to select a compound with a strong binding strength with ALDH2 than the other methods of MACCS Keys, RDKit, and FCFP4 in most cases, as indicated by a higher probability at more negative ∆Edock values (Figure 2).For instance, a higher probability for ∆Edock ≤ −8 kcal/mol was observed for ECFP4 compared to the other three methods (Figure 2a).When using CVT-10216 as a reference, a similar performance was observed for the probability profiles of ECFP4 and FCFP4 with ∆Edock ≤ −9 kcal/mol (Figure 2b).The MACCS Keys fingerprint showed a bad performance with Daidzin and CVT-10216 as reference molecules (Figure 2a,b), while it appeared to be be er than the fingerprinting methods of RDKit and FCFP4 (Figure 2c).The good performance of ECFP4 may be ascribed to the fact that it describes detailed information on the structure of atoms and their neighborhoods, such as the number of direct connections of non-hydrogen atoms, the atomic number, atomic charge, and number of connected hydrogen atoms [46].As a molecular-shape based approach, USRCAT outperformed the 3D molecular fingerprint E3FP (Figure 2d-f).It is well acknowledged that a configurational match is one of the decisive factors favoring molecular associations between binding partners, like the binding of an antigen to an antibody.This may explain the good performance of USRCAT in finding potent inhibitors from a 3D structural view.
Binding affinities (∆Edock) with the receptor (ALDH2 plus NAD + ) amounted to −9.1 ± 0.5, −10.4 ± 0.7, and −8.0 ± 0.1 kcal/mol for the three reference compounds of Daidzin, CVT-10216, and CHEMBL114083, respectively.CVT-10216 showed the strongest binding result with ALDH2, in agreement with the IC50 values (Figure 1), whereas the Vina scoring failed to reproduce the relative binding strengths of Daidzin and CHEMBL114083.In this work, we aimed to hit compounds with a binding strength similar to or stronger than that of the As a molecular-shape based approach, USRCAT outperformed the 3D molecular fingerprint E3FP (Figure 2d-f).It is well acknowledged that a configurational match is one of the decisive factors favoring molecular associations between binding partners, like the binding of an antigen to an antibody.This may explain the good performance of USRCAT in finding potent inhibitors from a 3D structural view.
Binding affinities (∆E dock ) with the receptor (ALDH2 plus NAD + ) amounted to −9.1 ± 0.5, −10.4 ± 0.7, and −8.0 ± 0.1 kcal/mol for the three reference compounds of Daidzin, CVT-10216, and CHEMBL114083, respectively.CVT-10216 showed the strongest binding result with ALDH2, in agreement with the IC 50 values (Figure 1), whereas the Vina scoring failed to reproduce the relative binding strengths of Daidzin and CHEMBL114083.In this work, we aimed to hit compounds with a binding strength similar to or stronger than that of the reference molecules.We therefore used a cutoff of ∆E dock = −10 kcal/mol for a further selection of potential inhibitors.
The 2D/3D similarity search hit 33 compounds (∆E dock ≤ −10 kcal/mol) with distinct ZINC IDs from world-approved drugs.We were able to select 4, 24, and 17 drugs with the reference molecules of Daidzin, CVT-10216, and CHEMBL114083, respectively (Table 1).It appeared that when using CVT-10216 as the reference, more compounds with strong binding affinities with ALDH2 could be singled out.ECFP4, FCFP4, and USRCAT helped hit 14, 14, and 13 compounds, respectively, and they performed better than the other methods of MACCS Keys, RDKit, and E3FP, in line with the results stated above (Figure 2).The performances of the 2D molecular fingerprint methods of MACCS Keys and RDKit appeared to not be sensitive to the used references, and they yielded few hits (Table 1).6) 33 (12) Some compounds can be hit using more than one method or reference.The values in parenthesis are the numbers of compounds that can only be hit using the corresponding method(s)/reference(s).
During our tests, the ECFP4 (2D), FCFP4 (2D), and USRCAT (3D) similarity search methods showed an overall good performance.Note that some compounds could only be hit by one single method, as indicated by the numbers in parenthesis in Table 1.Using CVT-10216 as a reference, for instance, Sorafenib N-Oxide (ZINC003817152) and N-Desmethyl Imatinib (ZINC021981222) could only be selected via ECFP4.Similarly, FCFP4 hit both enantiomers of Sarizotan (R-type: ZINC000021067; S-type: ZINC000006990) with the reference CVT-10216; the 3D USRCAT method generated six hits (Table 1).In general, it is better to utilize different search algorithms and principles to capture the key and abundant physical and chemical characteristics for obtaining hits in the virtual screening of a large chemical database [34,[55][56][57][58].

Molecular Docking Prediction
Considering the different charge states and isomers, we obtained 42 hits in total from the docking predictions with binding affinities (∆E dock ) ≤ −10 kcal/mol (Table 2).Eltrombopag (ZINC011679756, q = −3) showed the strongest binding strength with ∆E dock = −11.2kcal/mol.Two more compounds of Indacaterol-8-O-Glucuronide (ZINC049783754) and Eltrombopag (q = −2) showed a binding of ∆E dock ≤ −11 kcal/mol.Note that the compounds were neutral (q = 0), if not stated otherwise.Isomer I of pranlukast (-IA and -IB; ZINC001542146) displayed a stronger binding result than its isomer II (-IIA and -IIB; ZINC015919406) by 0.7 kcal/mol.For the different charge states of Eltrombopag (ZINC011679756), Netarsudil (ZINC113149554), Troglitazone (ZINC000968278), and 5-O-Desmethyldonepezil (ZINC013449462), the Vina scoring [59] was able to generate different binding affinities.However, the Vina docking predicted identical binding affinities for the neutral and positively charged (q = 1) compounds of 6-O-desmethyldonepeil (ZINC013449412) and Sequinavir (ZINC026985532).This finding implies that the Vina scoring failed to predict the relative binding strengths for the compounds with different charge states in some cases.It may be related to an issue of not using atomic charges directly for the evaluation of electrostatic interactions in the Vina scoring, as stated in the previous work [60].
Table 2 also lists the Tanimoto coefficients (for the first five methods based on 2D/3D molecular fingerprints) and similarity scores (for USRCAT) using CVT-10216 as the reference molecule.The values using different references are presented in the Supplementary Materials (Tables S1-S6).The last column in Table 2 indicates the used references that are able to single out the hits.Daidzin helped hit four compounds of ZINC000256630457, ZINC000057674 (Flavone), ZINC049783754 (Indacaterol-8-O-Glucuronide), and 17-Alpha-Estradiol-3-Glucuronide (ZINC013515303), as indicated by "Y/X/X" (X = Y/N) in Table 2; the first one could be selected by only using the USRCAT method (Table 1) with the reference of Daidzin ("Y/N/N" in Table 2).CVT-10216 and CHEMBL114083 helped with the selection of 18 and 8 compounds, as indicated by "N/Y/N" and "N/N/Y", respectively (Table 2).Detailed information on the hits using different methods or references are given in the Supplementary Materials (Table S7).Of the 42 hits, only Flavone (ZINC000057674) could be selected using all of the three reference molecules, and this could only be performed via the FCFP4 method, as indicated by "Y/Y/Y" in Table 2 and Table S7.ZINC ID is the compound ID in the ZINC database, q is the net charge (e), and ∆Edock is the binding affinity with ALDH2 from the docking predictions.The Tanimoto coefficients for the molecular fingerprint similarity search using the 2D methods of MACCS Keys, RDKit, ECFP4, and FCFP4 and the 3D method of E3FP were given for the selected compounds.The column "USRCAT" lists the 3D similarity scores using the USRCAT approach.The values for the Tanimoto coefficients and similarity scores correspond to the drugs at the reference pH and are calculated using the CVT-10216 as the reference compound.The last column indicates the reference compounds (Daidzin/CVT-10216/CHEMBL114083) used to hit drugs.For instance, N/Y/Y means this compound can be hit using CVT-10216 or CHEMBL114083 as the reference compounds (Y, short for yes), but it cannot be hit using Daidzin as the reference (N, short for no).The results for the 2D/3D similarity search using different reference compounds are presented in the Supplementary Materials (Tables S1-S6).Pranlukast (q = −1 e) has four isomers that are named Pranlukast-IA, Pranlukast-IB, Pranlukast-IIA, and Pranlukast-IIB, so do Imatinib, Fexofenadine, and 5-O-Desmethyldonepezil.ZINC ID is the compound ID in the ZINC database, q is the net charge (e), and ∆E dock is the binding affinity with ALDH2 from the docking predictions.The Tanimoto coefficients for the molecular fingerprint similarity search using the 2D methods of MACCS Keys, RDKit, ECFP4, and FCFP4 and the 3D method of E3FP were given for the selected compounds.The column "USRCAT" lists the 3D similarity scores using the USRCAT approach.The values for the Tanimoto coefficients and similarity scores correspond to the drugs at the reference pH and are calculated using the CVT-10216 as the reference compound.The last column indicates the reference compounds (Daidzin/CVT-10216/CHEMBL114083) used to hit drugs.For instance, N/Y/Y means this compound can be hit using CVT-10216 or CHEMBL114083 as the reference compounds (Y, short for yes), but it cannot be hit using Daidzin as the reference (N, short for no).The results for the 2D/3D similarity search using different reference compounds are presented in the Supplementary Materials (Tables S1-S6).Pranlukast (q = −1 e) has four isomers that are named Pranlukast-IA, Pranlukast-IB, Pranlukast-IIA, and Pranlukast-IIB, so do Imatinib, Fexofenadine, and 5-O-Desmethyldonepezil.

Assessment via a Toxicity Evaluation
The liver serves as the main detoxification organ for drugs and the principal metabolic site upon alcohol intake.Therefore, an essential selection criterion is that potential drugs are expected to have less toxic outcomes and side effects.We assessed the toxicities of the selected 42 potential ALDH2 inhibitors (Table 2) using the ProTox-II online tool [47].The compounds that were predicted to be non-toxic were marked as N and the toxic compounds were marked as Y; a confidence estimate for the prediction is given in parenthesis (Table 3).Organ toxicity (hepatotoxicity) and four toxicological endpoints (cytotoxicity, mutagenicity, carcinogenicity, and immunotoxicity) were predicted and are presented in Table 3. Hepatotoxicity indicates liver dysfunction or liver damage in association with an overload of drugs or xenobiotics [47].The compounds that were predicated to be toxic with a confidence value lower than 0.6 or nontoxic for all of the tested toxicities were selected for the subsequent analysis.For the FDA-approved drugs, only Sequinavir, Fexofenadine, and Naproxen met such toxicity criteria (Table 3).Fexofenadine (ZINC003824921; denoted as Fexofenadine-I) was already investigated in our previous work [60], and it was used for a comparison with the hits in this work.Based on the toxicity assessment, we selected 15 compounds in total for the MD simulations, including four isomers of Pranlukast (IA, IB, IIA, and IIB; q = −1), different charge states of Sequinavir (q = 0/+1), Netarsudil (q = 0/+1), and Troglitazone (q = 0/−1), both enantiomers (q = +1) of Sarizotan, Zeaxanthin, Fexofenadine-II (ZINC003872566), and Naproxen (Table 3).

MD Simulation and Binding Energy Calculation
We performed 30 ns molecular dynamics (MD) simulations of the ALDH2 tetramer in the presence of the selected inhibitors and cofactor NAD + .The root-mean-square deviations (RMSDs) of backbone atoms of the ALDH2 monomers and a tetramer from the crystal structure as a function of the simulation time for the complexes with Netarsudil (q = 0/+1) are presented in Figure 3.For the neutral and positively charged states, the RMSDs of the tetramer were well converged and amounted to 0.13 and 0.15 nm, respectively during the MD simulations, revealing that the protein structures were well maintained (Figure 3a,b).Compared to the apo (ligand-free) form of ALDH2, the binding with the positively charged Netarsudil resulted in a greater fluctuation of the overall tetramer structure, while no significant changes in the RMSDs for the corresponding four monomers were observed (Figure 3 and Table 4).The RMSD values for the ALDH2 complexes with the selected 15 inhibitors as well as the three reference molecules are tabulated in Table 4.Although ligand binding may yield a large RMSD value of ca.0.2 nm in some cases, the ALDH2 tetramers can be well maintained with an RMSD of ≤0.16 nm (Table 4).The RMSDs of Netarsudil (q = 0/+1) from their initial configurations (i.e., docking poses) are presented in Figure 3c,d; the values for other compounds are given in Table S8.Similar to the protein backbones, the ligand structures tended to achieve an equilibrium state upon binding with the receptor after 10 ns MD simulations, as indicated by the small standard deviations (0.01-0.05 nm) for the ligand RMSDs in Table S8.In almost all cases, ALDH2 and the ligand structures achieved an overall stable state during the last 10 ns simulations, and the trajectories (i.e., snapshots of receptor-ligand complexes) in this time interval were used for the calculation of binding energies upon the association of receptor and ligand molecules.Table 4. Root-mean-square deviations (RMSDs) of protein backbones of the ALDH2 tetramer and the corresponding monomers (chains A-D) from the crystal structure and the binding energies (∆E bind ) between the receptor (ALDH2 plus NAD + ) and selected inhibitors.The selected inhibitors are taken from Table 3 with low toxicities.q is the net charge (e) of the inhibitors.The last 10 ns simulation trajectories were used to compute the RMSDs and ∆E bind .<∆E bind > is the averaged binding energy over the monomers (chains A-D) and is weighted by their Boltzmann factors.Standard deviations for RMSDs are less than 0.01 nm and are not presented here.Block averaging was used for the binding energy calculations for obtaining good statistics.The data for Daidzin and the ligand-free form of ALDH2 were taken from Ref. [60].

MD Simulation and Binding Energy Calculation
We performed 30 ns molecular dynamics (MD) simulations of the ALDH2 tetramer in the presence of the selected inhibitors and cofactor NAD + .The root-mean-square deviations (RMSDs) of backbone atoms of the ALDH2 monomers and a tetramer from the crystal structure as a function of the simulation time for the complexes with Netarsudil (q = 0/+1) are presented in Figure 3.For the neutral and positively charged states, the RMSDs of the tetramer were well converged and amounted to 0.13 and 0.15 nm, respectively during the MD simulations, revealing that the protein structures were well maintained (Figure 3, panels a and b).Compared to the apo (ligand-free) form of ALDH2, the binding with the positively charged Netarsudil resulted in a greater fluctuation of the overall tetramer structure, while no significant changes in the RMSDs for the corresponding four monomers were observed (Figure 3 and Table 4).The RMSD values for the ALDH2 complexes with the selected 15 inhibitors as well as the three reference molecules are tabulated in Table 4.Although ligand binding may yield a large RMSD value of ca.0.2 nm in some cases, the ALDH2 tetramers can be well maintained with an RMSD of ≤0.16 nm (Table 4).The RMSDs of Netarsudil (q = 0/+1) from their initial configurations (i.e., docking poses) are presented in Figure 3 (panels c and d); the values for other compounds are given in Table S8.Similar to the protein backbones, the ligand structures tended to achieve an equilibrium state upon binding with the receptor after 10 ns MD simulations, as indicated by the small standard deviations (0.01-0.05 nm) for the ligand RMSDs in Table S8.In almost all cases, ALDH2 and the ligand structures achieved an overall stable state during the last 10 ns simulations, and the trajectories (i.e., snapshots of receptor-ligand complexes) in this time interval were used for the calculation of binding energies upon the association of receptor and ligand molecules.After removing water molecules and ions from the simulation trajectories, 100 receptorligand frames were subjected to the molecular mechanics Poisson-Boltzmann surface area (MM-PBSA) analysis for the binding energy calculations.The cofactor NAD + was adjacent to the ligand-binding domain of ALDH2, and it was regarded as one of the receptor residues in the calculation.There were differences in the binding energies (∆E bind ) of different monomers (Table 4), and we used the Boltzmann factor as a weight [61,62] to compute the averaged binding energies (<∆E bind >) over the four monomers (Equation ( 2)): where i indicates the monomer chain (A-D), R is the ideal gas constant, and T is the temperature (K).The monomer with a lower ∆E bind had a heavier weight and <∆E bind > was therefore very close to the lowest value of the four monomers.The binding energies (<∆E bind >) between the receptor (ALDH2 plus NAD + ) and ligand for the three reference molecules of Daidzin, CVT-10216, and CHEMBL114083 amounted to −26.3, −35.5, and −23.5 kcal/mol, respectively (Table 4).Positively charged compounds yielded binding energies of −65.4 kcal/mol for Sequinavir and ca.−60 kcal/mol for R/S Sarizotan and Netarsudil.Neutral drugs of Zeaxanthin and Troglitazone presented predictions of −45.9 and −40.3 kcal/mol, respectively, showing a stronger binding strength than CVT-10216.Neutral states of Sequinavir and Netarsudil produced binding energies of −27.5 and −24.5 kcal/mol, respectively, similar to that with Daidzin and CHEMBL114083.The other six compounds with a negative charge (q = −1) showed near-zero or positive values for ∆E bind , implying that the complexation with these ligands were thermodynamically unfavorable.
The binding energy (∆E bind ) can be further divided into four contributions from van der Waals (∆E vdw ), electrostatic (∆E elec ), polar (∆G polar ), and nonpolar (∆G nonpolar ) interactions; the sum of the first two is the MM part (∆E MM ) and the last two provide the solvation contribution (∆G sol ).In all cases, ∆E vdw and ∆G nonpolar favored ligand binding, whereas ∆G polar showed the opposite (Table 5).Due to the great contribution of ∆G polar , the solvation part (∆G sol ) did not favor the binding; this was probably related to the (unfavorable) desolvation of receptor and ligand molecules upon complexation [61][62][63].The receptor had a negative net charge (q = −6 for ALDH2), and the unfavorable binding with negatively charged inhibitors (mentioned above) may have been due to the electrostatic repulsion of the like sign between the binding partners, as indicated by the positive ∆E elec (Table 5).On the contrary, the positively charged compounds (q = +1) contributed considerably (∆E elec < −70 kcal/mol) due to the electrostatic attraction, giving rise to the strongest binding outcome with ALDH2 (Tables 4 and 5).The decomposition energies are for the monomers with the lowest binding energies (Table 3).Column 2 indicates whether the compound was approved by the FDA or not.q is the net charge (e) of the compounds and q = −6 e for the ALDH2 monomer.Different pH values include the Reference (ref, pH = 7.4), Middle (mid, 6.4-8.4),Low (lo, 5.4-6.4), and High (hi, 8.4-9.4)conditions.∆E MM is the MM part and is the sum of ∆E vdW and ∆E elec .∆G sol is the solvation part and amounts to ∆G polar plus ∆G nonpolar .The binding energies for the three reference compounds are also presented at the bottom; the energies for Daidzin were taken from Ref. [60].

Identification of the Key Residues for Ligand Binding
After the assessments via a similarity search, molecular docking, toxicity prediction, and MM-PBSA analysis, the five compounds of Sequinavir (q = +1), Sarizotan (R/S; q = +1), Netarsudil (q = 0/+1), Zeaxanthin (q = 0), and Troglitazone (q = 0) were hit and used to investigate receptor-ligand interactions at a molecular level.Compared with the crystal ALDH2/Daidzin complex, Netarsudil was not allowed to penetrate deep into the hydrophobic ligand-binding tunnel (Figure 4a), and its strong binding with ALDH2 was attributed to the interactions with protein residues both inside the tunnel (Figure 4b) and in the entrance or outside of the tunnel (Figure 4c).Inside the tunnel, protein residues with aromatic rings, such as Phe170, Trp177, Phe296, and Phe459, offered π-π interactions, and Met124 and Leu173 provided π-alkyl interactions (Figures 4b and 5a).In the entrance or outside of the tunnel, Val120 and Lys127 formed hydrogen bonds with Netarsudil (q = 0), Val115 and Ile116 offered alkyl interactions, and Leu119 and Val120 provided π-alkyl interactions (Figures 4c and 5a).The positively charged Netarsudil also showed a shallow penetration into the ligand-binding tunnel, whereas the binding pose was different from its neutral state (Figure 5a,b).The catalytic site of ALDH2 was located at the bottom of the ligand-binding tunnel and was adjacent to the cofactor-binding domain.Considering the shallow penetration, Netarsudil was likely not an ideal inhibitor against ALDH2 because it did not fully occupy the ligand-binding tunnel, leaving a space for substrate binding (Figure 4).A similar shallow penetration was observed for both enantiomers of Sarizotan.Unlike Netarsudil, Sequinavir, Troglitazone, and Zeaxanthin penetrated deep into the ligand-binding tunnel.For instance, the cofactor NAD + (residue NDP501) and the inhibitor Troglitazone formed hydrogen bonds (Figure 5c).Glu268, which was located at the bo om of the hydrophobic tunnel, formed van der Waals contacts with Troglitazone (Figure 5c).Another residue, Asn169, at the bo om of the tunnel, provided hydrogen bonding and van der Waals interactions with Troglitazone (Figure 5c) and Zeaxanthin (Figure 5d), respectively.Phe170 and Phe296 were located in the middle of the hydropho-Figure 5. Two-dimensional diagrams of receptor-ligand interactions for the ALDH2 complexes with (a) Netarsudil (q = 0), (b) Netarsudil (q = +1), (c) Troglitazone (q = 0), and (d) Zeaxanthin (q = 0).C, H, O, N, and S atoms of the ligands are colored in gray, white, red, blue, and orange, respectively.The cofactor NAD + is regarded as one of the receptor residues (NDP501).Averaged snapshots from the last 10 ns MD simulations were used to generate the diagrams, and the interaction types are presented by different colors.Left moieties of the inhibitors were inserted into the hydrophobic ligand-binding tunnel of ALDH2.
Unlike Netarsudil, Sequinavir, Troglitazone, and Zeaxanthin penetrated deep into the ligand-binding tunnel.For instance, the cofactor NAD + (residue NDP501) and the inhibitor Troglitazone formed hydrogen bonds (Figure 5c).Glu268, which was located at the bottom of the hydrophobic tunnel, formed van der Waals contacts with Troglitazone (Figure 5c).Another residue, Asn169, at the bottom of the tunnel, provided hydrogen bonding and van der Waals interactions with Troglitazone (Figure 5c) and Zeaxanthin (Figure 5d), respectively.Phe170 and Phe296 were located in the middle of the hydrophobic tunnel and they offered π-π and π-alkyl interactions.Sulfur-containing residues, such as Met174 and Cys303, provided π-sulfur interactions with Troglitazone, while Met174 displayed van der Waals interactions with Netarsudil.Ile116 and Val120 were situated at the entrance of the ligand-binding tunnel and they were able to offer a variety of interactions, such as alkyl, π-alkyl, and hydrogen bonding with the inhibitors (Figure 5).Compared with CVT-10216 (Figure 6a), Zeaxanthin interacted with more protein residues, such as Val115, Ile116, Lys338, and E340 (Figures 5d and 6b).The receptor−ligand-binding energies were further decomposed into the energy contributions per residue for identifying the key residues responsible for the binding.The residues with a contribution of ≥1 kcal/mol for at least one of the neutral inhibitors are presented in Figure 7.In most cases, the protein residues and cofactor NAD + that were located at the bo om, in the middle, or at the entrance of the ligand-binding tunnel favored the ligand-binding process.However, the charged residues, such as Lys112, Lys127, Arg329, and Asp457, appeared to disfavor the binding for all of the neutral inhibitors, except for Asp457, with a favorable contribution to the binding with Netarsudil (Figure 7, top).The disfavor was mainly due to the considerable contributions from polar interactions (i.e., positive ΔGpolar values), as shown in Table S9 in the Supplementary Materials.For instance, Asp457 formed van der Waals contacts with Zeaxanthin (Figures 5d and 6b) with a favorable ΔEMM of −1.45 kcal/mol, while ΔGpolar and ΔGnonpolar amounted to 3.93 and −0.20 kcal/mol, respectively, resulting in a total ΔEbind of 2.28 kcal/mol (Figure 7 and Table S9).The receptor−ligand-binding energies were further decomposed into the energy contributions per residue for identifying the key residues responsible for the binding.The residues with a contribution of ≥1 kcal/mol for at least one of the neutral inhibitors are presented in Figure 7.In most cases, the protein residues and cofactor NAD + that were located at the bottom, in the middle, or at the entrance of the ligand-binding tunnel favored the ligand-binding process.However, the charged residues, such as Lys112, Lys127, Arg329, and Asp457, appeared to disfavor the binding for all of the neutral inhibitors, except for Asp457, with a favorable contribution to the binding with Netarsudil (Figure 7, top).The disfavor was mainly due to the considerable contributions from polar interactions (i.e., positive ∆G polar values), as shown in Table S9 in the Supplementary Materials.For instance, Asp457 formed van der Waals contacts with Zeaxanthin (Figures 5d and 6b) with a favorable ∆E MM of −1.45 kcal/mol, while ∆G polar and ∆G nonpolar amounted to 3.93 and −0.20 kcal/mol, respectively, resulting in a total ∆E bind of 2.28 kcal/mol (Figure 7 and Table S9).Energy contributions (kcal/mol) per residue to the binding of the receptor ALDH2 with the selected neutral (top) and positively charged (bo om) inhibitors of Zeaxanthin (q = 0), Troglitazone (q = 0), Netarsudil (q = 0/+1), Sequinavir (q = +1), and R/S Sarizotan (q = +1).The shown residues contribute ≥ 1 kcal/mol (top) or 4 kcal/mol (bo om) to the binding with at least one of the inhibitors.
For binding with positively charged compounds (q = +1), the key residues were almost the amino acids with a net charge, except for Ala7 (Figure 7, bo om).Ala7 at the Nterminal of ALDH2 was close to the entrance of the ligand-binding tunnel and it disfavored the binding of inhibitors.Because of the electrostatic repulsion, positively charged residues, such as Lys and Arg, disfavored the binding.On the contrary, negatively charged residues, such as Asp and Glu, displayed favorable contributions due to the electrostatic a raction.The cofactor NAD + still favored the binding of the inhibitors to the receptor ALDH2.For these residues, the main contributions were from ΔEMM, while the solvation (ΔGsol) did not contribute much to the binding in most cases (Tables S10 and  S11).

Reference Molecule and Drug Database for the Similarity Search
Molecules with similar chemical structures and/or three-dimensional shapes might have comparable physiological functions and activities and, hence, it would yield a good match with, for instance, the binding site of the target protein (i.e., receptor).Following this principle, we chose three isoflavone analogs of Daidzin, CVT-10216, and CHEMBL114083 (h ps://www.ebi.ac.uk/chembl; accessed on 30 June 2023) as the reference molecules (Figure 1) for the ligand-based similarity search, in order to hit potential drug molecules for the inhibition against ALDH2.
The world-approved drugs were extracted from the ZINC [65,66] online database (h ps://zinc.docking.org/substances/subsets/world;accessed on 30 June 2023).As of the accessed date, there were 5903 approved drugs in major jurisdictions, including the U.S. Food and Drug Administration (FDA).The 3D structures of some drugs were available in the ZINC database and they were given at different pH conditions (Reference: pH = 7.4; Figure 7. Energy contributions (kcal/mol) per residue to the binding of the receptor ALDH2 with the selected neutral (top) and positively charged (bottom) inhibitors of Zeaxanthin (q = 0), Troglitazone (q = 0), Netarsudil (q = 0/+1), Sequinavir (q = +1), and R/S Sarizotan (q = +1).The shown residues contribute ≥ 1 kcal/mol (top) or 4 kcal/mol (bottom) to the binding with at least one of the inhibitors.
For binding with positively charged compounds (q = +1), the key residues were almost the amino acids with a net charge, except for Ala7 (Figure 7, bottom).Ala7 at the N-terminal of ALDH2 was close to the entrance of the ligand-binding tunnel and it disfavored the binding of inhibitors.Because of the electrostatic repulsion, positively charged residues, such as Lys and Arg, disfavored the binding.On the contrary, negatively charged residues, such as Asp and Glu, displayed favorable contributions due to the electrostatic attraction.The cofactor NAD + still favored the binding of the inhibitors to the receptor ALDH2.For these residues, the main contributions were from ∆E MM , while the solvation (∆G sol ) did not contribute much to the binding in most cases (Tables S10 and S11).Molecules with similar chemical structures and/or three-dimensional shapes might have comparable physiological functions and activities and, hence, it would yield a good match with, for instance, the binding site of the target protein (i.e., receptor).Following this principle, we chose three isoflavone analogs of Daidzin, CVT-10216, and CHEMBL114083 (https://www.ebi.ac.uk/chembl; accessed on 30 June 2023) as the reference molecules (Figure 1) for the ligand-based similarity search, in order to hit potential drug molecules for the inhibition against ALDH2.

Computational Methods
The world-approved drugs were extracted from the ZINC [65,66] online database (https://zinc.docking.org/substances/subsets/world;accessed on 30 June 2023).As of the accessed date, there were 5903 approved drugs in major jurisdictions, including the U.S. Food and Drug Administration (FDA).The 3D structures of some drugs were available in the ZINC database and they were given at different pH conditions (Reference: pH = 7.4; Middle: pH = 6.4-8.4;Low: pH = 5.4-6.4;High: pH = 8.4-9.4)[65].If not available, we calculated the structures at pH = 7.4 (i.e., the Reference condition) via the Open Babel toolbox (version 3.1.0)[67].Considering multiple charge states or isoforms at different (pH) conditions, we obtained a set of 7658 drug compounds for use in total, and Daidzin belonged to this set (ZINC ID: ZINC004098610).

The 2D/3D Similarity Search
The open-source RDKit toolkit (version 2022.09.5) [44] was used to generate molecular fingerprints and calculate molecular similarities.Four build-in methods for generating twodimensional (2D) fingerprints were tested, namely, RDKit-specific (topological) fingerprints, MACCS Keys [45], Extended-Connectivity FingerPrints (ECFP4) [46], and Functional-Class FingerPrints FCFP4 (a variant of ECFP).The last two were derived via applying the Morgan algorithm [47] and therefore belonged to a family of Morgan fingerprints (also known as circular fingerprints).The SMILES (simplified molecular input line entry system) format of 5903 drug compounds from the ZINC database [65,66] was converted to the MOL format via the RDKit toolkit [44], and then used for obtaining 2D molecular fingerprints.
Three-dimensional (3D) similarity searches were conducted using the algorithms of E3FP [49] and USRCAT [50].E3FP stands for the Extended 3D FingerPrint, inspired by the commonly used ECFP (2D) fingerprint, and it tightly integrates with the RDKit toolkit (https://anaconda.org/conda-forge/e3fp;accessed on 30 June 2023).USRCAT is an extension of the ultrafast shape recognition (USR) algorithm [51,52] with CREDO Atom Types [53], and it is a built-in module in the RDKit toolkit.The SMILES format of the drug compounds at pH = 7.4 was generated via the Open Babel toolbox (version 3.1.0)[67] and then converted to 3D conformations in the SDF format via the Biovia Discovery studio visualizer software (version 2019).The SDF files were used to generate 3D fingerprints (for E3FPs) or perform a molecular-shape-based similarity evaluation (for USRCAT).
The similarity of the 2D/3D molecular fingerprints was evaluated via a metric of Tanimoto coefficients (Tc), and for the USRCAT method, a metric of similarity scores was used.Both metrics ranged from 0 to 1, and a higher value indicated a higher similarity.In total, we tested six methods for the 2D/3D similarity screening of the world-approved drugs, using three reference compounds (Section 3.1.1).The top 100 hits for each search were merged for a further evaluation.

Ligand and Receptor Preparations
The molecular structures of the hit compounds from the similarity search were taken from the ZINC 15 database [65,66] in the MOL2 format; if not available, the 3D structures were generated from the SMILES files via the Biovia Discovery studio visualizer software (version 2019).A python script (prepare_ligand4.py) in the MGLTools package (version 1.5.6,https://ccsb.scripps.edu/mgltools;accessed on 30 June 2023) [68] was used to prepare the ligand PDBQT files for the docking calculation in batch mode.
The crystal structure of the receptor ALDH2 were retrieved from the Protein Data Bank (PDB) database (PDB code: 2VLE); it formed a complex with Daidzin and the cofactor NAD + was not yet determined [31].Following the previous work [60,69], we constructed a tetramer of ALDH2/Daidzin/NAD + complexes, and NAD + was regarded as one of the receptor residues in the docking.The AutoDockTools module in the MGLTools package (version 1.5.6)[68] was used to generate the receptor PDBQT file via adding polar hydrogens, computing Gasteiger charges, and assigning AD4 atom types.

Docking Calculation
Autodock Vina software (version 1.1.2)[59] was used to conduct docking predictions in the batch mode.The search space was 3 × 3 × 3 nm 3 with center values of x = 9.14, y = 1.65, and z = 7.08 nm, roughly close to the geometrical center of Daidzin.Default values were used for other parameters in the Vina docking.Such a protocol was validated in our previous work, where the docking pose of Daidzin agreed well with its crystal state [69].
The docking was implemented 10 times with random seeds for each ligand, and the binding poses with the strongest binding affinities were used for the data collection.Note that the Vina docking did not use explicit atomic charges for scoring [59], although it used the PDBQT files as inputs.The atomic charges in the PDBQT files were therefore ignored in the Vina scoring.

Toxicity Prediction
The hit compounds after two rounds of screening (i.e., similarly search and docking prediction) were submitted to the ProTox-II online server [70] to predict the toxicity (https://tox-new.charite.de/protox_II,accessed on 30 June 2023).Toxicological endpoints of mutagenicity, carcinogenicity, cytotoxicity, and immunotoxicity were tested as well as the organ toxicity (hepatotoxicity).

Molecular Simulation Protocol
The selected drug inhibitors and reference compounds were subjected to 30 ns molecular dynamics (MD) simulations in the tetramer forms of ALDH2/ligand/NAD + complexes in physiological salt concentrations of 0.15 mol/L.For instance, the length of the simulation cell for CVT-10216 was ca.10.5 nm, containing an ALDH2 tetrameter, 4 ligands (CVT-10216), 4 cofactors (NAD + ), 4 Mg 2+ ions, 123 Na + , 103 Cl − , and 27,758 water molecules.We performed production MD simulations at the NPT ensemble (P = 1 bar; T = 298.15K) using GROMACS software (version 2018.4)[71], with a time step of 0.002 ps.For a comparison with the previous work [60,69], the Amber 99SB-ILDN force field [72] was used to model ALDH2 and the General Amber Force Field (GAFF) [73] was used for the ligands.Force-field parameters of NAD + [74,75] were obtained from a collection by the group of Richard Bryce (http://research.bmh.manchester.ac.uk/bryce/amber; accessed on 30 June 2023).The rigid TIP3P model [76] was used to describe water molecules.Restricted electrostatic potential (RESP) charges were assigned to the ligands.The calculation of the RESP charges and more details on the simulation protocol were presented in our previous work [60,69].

MM-PBSA Analysis
The molecular mechanics Poisson−Boltzmann surface area (MM-PBSA) analysis has the advantage of evaluating the driving forces quantitatively and identifying the key residues responsible for molecular associations.Here, we used the "g_mmpbsa" toolkit [64] to perform such an analysis using the last 10 ns simulation trajectories.Mg 2+ , Na + , and Cl − ions and water molecules were removed, and the atoms that jumped across the simulation box were put back.This ensured that the receptor/ligand molecules remained whole, and this could be performed using the GROMACS tool of "gmx trjconv" with an option of "-pbc nojump" [71].The trajectories were saved with an interval of 100 ps and we obtained 100 snapshots in total for use in the MM-PBSA analysis.
The binding free energy (∆G bind ) of molecular associations, in general, consists of molecular mechanics (MM, ∆E MM ), solvation (∆G sol ), and entropy (∆S) contributions, as given in Equation (3): ∆G bind = ∆E MM + ∆G sol -T∆S = ∆E vdW + ∆E elec + ∆G polar + ∆G nonpolar -T∆S ∆E MM is a sum of van der Waals (∆E vdW ) and electrostatic (∆E elec ) interactions, and ∆G sol can be decomposed into polar (∆G polar ) and nonpolar (∆G nonpolar ) solvation contributions.The entropy calculation has yet not been supported by the "g_mmpbsa" toolkit [64], and we therefore did not consider the entropy contribution; the calculated values are known as the binding energy (∆E bind ).∆G polar was calculated using the built-in APBS package [77], and for ∆G nonpolar , the solvent accessible surface area (SASA) model was used for the calculation.Python scripts of "MmPbSaStat.py"and "MmPbSaDecomp.py",written by the developer team of "g_mmpbsa" (http://rashmikumari.github.io/g_mmpbsa/Usage.htm;accessed on 30 June 2023), were used to address the data statistics and energy decom-

Figure 3 .
Figure 3. Root-mean-square deviations (RMSDs) of protein backbone atoms (a,b) of the ALDH2 tetramer and corresponding monomers (chains A-D) from the crystal structure as a function of simulation time for ALDH2 complexes with the neutral (left) and charged (right) states of netarsudil.The ligand RMSDs from the initial configurations (i.e., docking poses) were calculated excluding hydrogen atoms (c,d).

Figure 3 .
Figure 3. Root-mean-square deviations (RMSDs) of protein backbone atoms (a,b) of the ALDH2 tetramer and corresponding monomers (chains A-D) from the crystal structure as a function of simulation time for ALDH2 complexes with the neutral (left) and charged (right) states of netarsudil.The ligand RMSDs from the initial configurations (i.e., docking poses) were calculated excluding hydrogen atoms (c,d).

Figure 4 .
Figure 4. Representative binding poses of ALDH2 with Netarsudil (q = 0) during MD simulations.The receptor ALDH2 is displayed as a solid ribbon model; Netarsudil is shown as a stick model and its carbon atoms are colored in cyan.Daidzin is represented by the ball and stick model (in black) to depict the hydrophobic tunnel for ligand binding in the crystal structure of protein 2VLE (a).A comparison with Daidzin indicates the relative position of Netarsudil in the ligand-binding tunnel.The cofactor NAD + is shown with the ball and stick model and is colored by element, and the Mg 2+ ion is represented by a green ball.Panel (b) presents the interactions between Netarsudil and protein residues (in green) inside the ligand-binding tunnel, and panel (c) shows the interactions with the residues located in the entrance or outside of the tunnel.Receptor−ligand interactions are indicated

Figure 4 .
Figure 4. Representative binding poses of ALDH2 with Netarsudil (q = 0) during MD simulations.The receptor ALDH2 is displayed as a solid ribbon model; Netarsudil is shown as a stick model and

Figure 5 .
Figure 5. Two-dimensional diagrams of receptor-ligand interactions for the ALDH2 complexes with (a) Netarsudil (q = 0), (b) Netarsudil (q = +1), (c) Troglitazone (q = 0), and (d) Zeaxanthin (q = 0).C, H, O, N, and S atoms of the ligands are colored in gray, white, red, blue, and orange, respectively.The cofactor NAD + is regarded as one of the receptor residues (NDP501).Averaged snapshots from the last 10 ns MD simulations were used to generate the diagrams, and the interaction types are presented by different colors.Left moieties of the inhibitors were inserted into the hydrophobic ligand-binding tunnel of ALDH2.

28 Figure 6 .
Figure 6.The mapping of residue-wise energy contributions (kcal/mol) on the ALDH2 structures for binding with CVT-10216 (a) and Zeaxanthin (b).Carbon atoms of the ligands are colored in cyan.Key residues with a contribution of ≥1 kcal/mol are shown with a stick model and colored by their energy contributions; the blue resides indicate the stabilization of the ligand, while the red ones disfavor the binding.The PDB files containing energy values in the B factor were obtained via the "energy2bfac" module in the "g_mmpbsa" package [64], and were used to generate this figure by PyMOL (h ps://www.pymol.org(accessed on 30 June 2023); version 2.5.4).

Figure 6 .
Figure 6.The mapping of residue-wise energy contributions (kcal/mol) on the ALDH2 structures for binding with CVT-10216 (a) and Zeaxanthin (b).Carbon atoms of the ligands are colored in cyan.Key residues with a contribution of ≥1 kcal/mol are shown with a stick model and colored by their energy contributions; the blue resides indicate the stabilization of the ligand, while the red ones disfavor the binding.The PDB files containing energy values in the B factor were obtained via the "energy2bfac" module in the "g_mmpbsa" package [64], and were used to generate this figure by PyMOL (https://www.pymol.org(accessed on 30 June 2023); version 2.5.4).

Table 1 .
The number of hit compounds with a strong binding strength (∆E dock ≤ −10 kcal/mol) using different methods and reference molecules.

Table 3 .
Toxicity predictions for the 42 selected compounds in Table2via the ProTox-II platform.
ZINC ID is the compound ID in the ZINC database.Toxicity predictions include hepatotoxicity (dili for short), carcinogenicity (carcino), immunotoxicity (immune), mutagenicity (mutagen), and cytotoxicity (cyto); N indicates inactive and Y indicates active.The numbers given in parenthesis are the confidence values for the predictions.The last column indicates whether the compound has been approved by the U.S. Food and Drug Administration (FDA) or not.

Table 5 .
Decomposition of the binding energies (kcal/mol) of the selected inhibitors with ALDH2 using the MM-PBSA analysis of the last 10 ns simulation trajectories.